Small Classes and Their Effects

Bruce J. Biddle 
& 

David C. Berliner 

Interest in class size is widespread today. Debates often appear about "ideal" class size, and 
controversial efforts to reduce class size have appeared at both the federal level and in various states 
around the nation. Moreover, a good deal of research has appeared on class size, and controversies 
have also arisen about that research and its findings. What types of research have appeared on class 
size to date, what findings have surfaced from that research and how can we explain those findings, why 
have those findings provoked controversy, and what should we conclude now about class-size policies 
from research on the topic? 



The Issue 

Conflict has often appeared concerning ideal class size. Educators have long argued that students do 
better in smaller classes, but fiscal conservatives and those who want to reduce public school funding 
have claimed that students do just as well in larger classes, and politicians often quarrel about whether 
we should spend additional tax dollars to reduce class sizes. 

Responding to this debate, a large amount of research has also appeared on the impact of class 
size; indeed, more studies may have surfaced for this topic than for any other question in education! 
One might assume that this huge research effort would have now provided clear answers about the 
effects of class size, but no: sharp disagreements have also appeared about findings from these 
studies. Consider the following only-too-typical quotes about class size from scholar-activists: 



This research leaves no doubt that small classes have an advantage over larger classes in reading and 
math in the early primary grades. 

Jeremy Finn and Charles Achilles (1990, p. 573) 



There is no credible evidence that across-the-board reductions in class size boost pupil 
achievement.



Chester Finn and Michael Petrilli (1998, p. 2) 
Or these from reviewers of class-size studies: 

Large reductions in school class size promise learning benefits of a magnitude commonly believed not 
within the power of educators to achieve. 

Gene Glass, Leonard Cahen, Mary Lee Smith, 
and Nikola Filby (1982, p. 50) 

This article has concentrated on the limited task of reviewing the evidence on ... reducing class size. The 
surprising finding is that the evidence does not offer much reason to expect a systematic effect from overall 
class size reduction policies. 

Eric Hanushek (1999, p. 158) 



Or these from advocacy groups: 



Taken together, these studies ... provide compelling evidence that reducing class size, particularly for 
younger children, will have a positive effect on student achievement. 

Dan Murphy & Bella Rosenberg, 
writing as representatives of 
The American Federation of Teachers (1998, p. 3) 

There's no evidence that smaller class sizes alone lead to higher student achievement. 

Nina Shokraii Rees & Kirk Johnson, 
writing as representatives of 
The Heritage Foundation (2000, p. 1) 

It is easy to understand why The American Federation of Teachers and The Heritage Foundation 
would sponsor such conflicting judgments. After all, the former group speaks for public-school teachers 
who strongly favor smaller classes, whereas the latter stands foursquare against unions in education 
and increases in public spending. But why on earth have scholars and reviewers come to such 
divergent views about research on class size, and what does the evidence really say? Further, if small 
classes generate benefits, why should such benefits appear, and do those benefits apply to all (or 
merely some) students, levels of education, topics of instruction, and forms of advantage? 

Studies and Their Findings 

Early Small Field Experiments 

To answer these questions we must look at several traditions of research beginning first with early 
experiments on class size. As a rule, experiments are created when investigators are able to assign 
research subjects to "experimental" and "control" treatments randomly and then compare results for those 
conditions. Experiments are popular because they involve intervention in the natural world and are 
thought to provide information about causes and effects. Some experiments with people are done in 
laboratories where environmental conditions may be controlled, but experiments on class size are nearly 
always done in field settings, such as schools, where external conditions can intrude into the design and 
also affect results. (Researchers have learned over the years that schools are very messy contexts in 
which to conduct experiments, although they continue to try to do so.) 

Small experimental (or quasi-experimental) studies of the impact of class size are easy to 
organize and have been conducted for years in America. The first such studies seem to have appeared 
in the 1920s, and more than 100 of them have since been reported. Informal reviews of these efforts 
began to appear in the 1960s, and most of these stressed that, based on evidence then available, 
differences in class size seemed to have but little impact. However, by the late 1970s a more 
sophisticated technique for reviewing had been invented, meta-analysis, and reviewers quickly applied 
this technique to results from these early experiments.¹ Although the authors of these reviews have 
quarreled about details of their conclusions and the best way to apply meta-analyses to class-size 
studies, a consensus has gradually emerged from their efforts about findings these studies had 
developed: 

- Short-term exposure to small classes had been found to increase measured student 
achievements, but the extra gains it had generated were often minor; 

- Extra gains associated with small classes had appeared mainly when class size was 
reduced to less than 20 students; 

- Extra gains associated with small classes had been stronger for the early grades; and 

- Extra gains associated with small classes had been stronger for students who came 
from groups that were traditionally disadvantaged in education. 

However, these early class-size experiments had usually involved only small samples, short-term 
exposure to small classes, but one measure of student success, and a single educational context (such 
as one school or school district)--and some had employed poor designs which made their results 
questionable--so it was difficult to assess what would happen if students were exposed to small classes 

¹ See Glass & Smith (1979); Educational Research Service (1980); Glass, Cahen, Smith, & Filby (1982); Hedges & Stock (1983); Slavin 
(1984); Robinson & Wittebols (1986); Robinson (1990); Mosteller, Light & Sachs (1996). In brief, meta-analysis involves the statistical assembly of 
results from small-but-similar studies so that one can estimate the effects that should appear in the population represented by those studies. Meta- 
for longer periods of time and whether early small-class advantages were limited in scope and 
sustainable. Different kinds of research would presumably be needed if one were to answer these latter 
questions. 

Surveys and Econometric Studies 

Another tradition of research, based on survey designs, has also provided evidence on class size and its 
effects. This second type of research relies on the fact that naturally occurring differences in school and 
classroom characteristics appear in American education and asks whether these differences are 
associated with student outcomes. To answer this question, investigators collect and compare survey 
data from students, teachers, school administrators, and public records. 

When well-designed, surveys can examine a broad range of educational contexts and topics and 
offer opportunity to study the impacts of variables that cannot (or should not) be manipulated in 
experiments, such as gender, minority status, and childhood poverty. On the other hand, survey research 
has difficulty establishing relations between causes and effects. Why should this be so? Let us assume 
that a survey examines a sample of schools where average class size varies and discovers that those 
schools with smaller classes also have higher levels of student achievement. Does this mean that the 
former necessarily generated the latter? Hardly. Those schools with smaller classes might also have had 
more qualified teachers, better equipment, more up-to-date curricula, newer school buildings, more 
students from affluent homes, a more-supportive community environment, or other advantages, and these 
latter factors may also have helped to generate higher levels of achievement. Thus, to establish the case 
for a causal relation between class size and student outcomes with survey data, one must use statistical 
processes which weed out (or "control for") the competing effects of other variables that might also be 
affecting students.² 
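
The point about competing variables can be made concrete with a toy calculation. The sketch below is only an illustration: the eight schools, their scores, and the single "affluent intake" flag are invented and come from no study, and averaging within-stratum gaps with equal weights is a deliberately crude stand-in for the regression controls that real surveys use.

```python
from statistics import mean

# Hypothetical schools: (small_classes, affluent_intake, mean_test_score).
# Every value is invented for illustration.
schools = [
    (True,  True,  80), (True,  True,  82), (True,  True,  81),
    (True,  False, 62),
    (False, True,  78),
    (False, False, 60), (False, False, 61), (False, False, 59),
]

def gap(rows):
    """Mean score of small-class schools minus mean score of the others."""
    small = [score for is_small, _, score in rows if is_small]
    large = [score for is_small, _, score in rows if not is_small]
    return mean(small) - mean(large)

# Naive comparison that ignores who attends which schools.
raw_gap = gap(schools)

# "Control for" affluence: compare schools only within the same stratum,
# then average the two within-stratum gaps (equal weights, for simplicity).
strata_gaps = [gap([row for row in schools if row[1] == affluent])
               for affluent in (True, False)]
adjusted_gap = mean(strata_gaps)

print(f"raw gap: {raw_gap:.2f}, adjusted gap: {adjusted_gap:.2f}")
```

With these invented figures the raw gap is 11.75 points, but it falls to 2.5 points once schools are compared only with others of similar intake, because most of the small-class schools in the toy data also serve affluent students.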

Bearing this argument in mind, we look now at survey evidence on the effects of class size. 
Serious surveys on American education may be said to have begun in the 1960s with the famous 
Coleman Report.³ This massive, federally funded study involved a national sample and took on many 
issues then facing educators and politicians in the country. Today it is more often remembered, however, 
for its startling claim that although student achievements are strongly influenced by the qualities of their 

analyses are not without controversy, but they provide useful information when large-scale studies are not available. 

² This is a difficult but not impossible task. Take, for example, surveys which studied the relation between cigarette smoking and lung cancer. For 
years critics would complain that those surveys had not yet established a causal relation between smoking and cancer because those surveys had not 
yet examined other causal events that might also cause cancer (such as genetic factors, living in stressful or polluted cities, poor nutrition, and the like), 
but additional surveys would shortly appear thereafter which controlled for all these factors and more, and eventually thoughtful persons decided that 
the case had been made, that cigarette smoking did indeed cause lung cancer. 
families and peers, the qualities of their schools and classrooms have but little impact. 

This claim was greeted with dismay by educators and was endorsed with enthusiasm by fiscal 
conservatives and those critical of public education. But somehow, amidst the welter of subsequent 
disputes, neither group seemed to have noticed that the methods reported in the Coleman Report's study 
were seriously flawed and its supposed findings were even then being challenged by thoughtful critics. 
So, instead of questioning it, the public began to assume that the Report's peculiar claim about the 
supposedly weak effects of schools and classrooms was an established "fact." 

Since then scores of more modest surveys have been conducted seeking to establish whether 
differences in school funding or those things which funds can buy--such as small class sizes--are or are 
not associated with desired educational outcomes. Many of these have come from economists who 
wanted to test mathematical models for predicting educational outcomes, and most have involved 
questionable design features and small samples that did not represent the wide range of American 
schools, classrooms, or students. 

Nevertheless, enough of these surveys had appeared by the late 1970s that reviews seemed to be 
in order, and in the early 1980s Eric Hanushek, also an economist, began to publish a series of articles 
reviewing these works and discussing their supposed implications. Hanushek seems to have been 
committed, from the beginning, to a version of economic theory which argues that public schools are 
ineffective and should be replaced by a marketplace of competing private schools,⁴ and it is small wonder 
that his reviews have regularly concluded that differences in public school funding--as well as things that 
funds can buy--are not associated with educational outcomes. Most of the studies Hanushek has 
reviewed did not provide evidence on class size, but some seemed to focus on the class-size issue, and 
after reviewing the latter too, Hanushek has announced that class size also appears to have little impact.⁵ 

However, Hanushek's methods and conclusions have been challenged on several grounds. Meta- 
analysts, such as Larry Hedges and Rob Greenwald, have pointed out that Hanushek merely counts the 
number of effects he finds that are "statistically significant," but since most of those effects are based on 
studies with small samples, it is nearly inevitable that he would find but few "significant" effects. In 
contrast, when those effects are added together in meta-analyses, the overall results suggest that 

³ Coleman et al. (1966). 

⁴ Current versions of this theory seem to have evolved from the writings of two influential figures in economics, Milton Friedman (1962) and 
Kenneth Boulding (1972). It has recently been championed by John Chubb & Terry Moe (1990) among others. 
differences in school funding and those things that funds can buy--such as smaller classes--do, indeed, 
have an impact.⁶ 
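
The contrast between counting "significant" results and pooling them can be sketched numerically. The example below is a minimal illustration under stated assumptions: the five effect sizes and standard errors are invented, and the fixed-effect, inverse-variance pooling shown is a textbook simplification, not the procedure used in the reviews cited here.

```python
import math

# Hypothetical (effect size, standard error) pairs for five small studies.
# All numbers are invented for illustration.
studies = [(0.20, 0.15), (0.25, 0.16), (0.15, 0.14), (0.30, 0.18), (0.18, 0.15)]

Z_CRIT = 1.96  # two-sided 5% threshold under a normal approximation

# Vote counting: how many studies are individually "significant"?
significant = sum(1 for effect, se in studies if abs(effect / se) > Z_CRIT)

# Fixed-effect pooling: weight each study by the inverse of its variance.
weights = [1 / se ** 2 for _, se in studies]
pooled = sum(w * effect for w, (effect, _) in zip(weights, studies)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
pooled_z = pooled / pooled_se

print(f"significant studies: {significant} of {len(studies)}")
print(f"pooled effect: {pooled:.3f}, z = {pooled_z:.2f}")
```

In this toy data every study points the same way, yet none is large enough to reach significance on its own, so a tally of significant results finds nothing. The pooled estimate, which draws on all five samples at once, clears the same threshold comfortably.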

Another economist, Alan Krueger, has also observed that Hanushek does not base his findings on 
the number of studies he reviews but rather on the number of different findings reported in those studies-- 
a procedure fraught with potential bias--and that results supporting the importance of class size pop up 
quickly if one corrects for these biases.⁷ 

And several commentators⁸ have pointed out that many of the supposed "class-size" studies 
Hanushek reviews do not examine class size directly but rather a proxy measure presumed to represent 
it: student-teacher ratio, defined as the number of students divided by the number of "teachers" reported 
for a school or school district. The troubles with this latter measure are that it ignores how students and 
teachers are allocated to classrooms and often includes counts of administrators, nurses, counselors, 
coaches, specialty teachers, and other professionals who rarely appear in classrooms at all. Such a ratio 
is, then, a poor way to estimate the number of students actually taught by teachers in specific 
classrooms, and it is the latter we need to know about if we are to study the effects of class size. 
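
A small worked example shows how far the two measures can diverge. The school below is hypothetical, and all of its counts are invented for illustration.

```python
from statistics import mean

# Hypothetical school; every count below is invented for illustration.
students = 600
classroom_teachers = 24      # each teaches one class
other_professionals = 12     # administrators, counselors, nurses, coaches, ...

# Student-teacher ratio as often reported: all professional staff are counted.
student_teacher_ratio = students / (classroom_teachers + other_professionals)

# Average class size: only staff who actually teach a class are counted.
average_class_size = students / classroom_teachers

# Allocation matters too: the same average can hide very unequal classes.
class_sizes = [15] * 12 + [35] * 12
assert sum(class_sizes) == students

print(f"student-teacher ratio: {student_teacher_ratio:.1f}")
print(f"average class size:    {average_class_size:.1f}")
print(f"largest class:         {max(class_sizes)}")
```

Here the reported ratio is under 17 students per "teacher," the average class holds 25, and half of the classes hold 35. A study that used the ratio would badly misjudge what students in those classrooms experience.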

Hanushek has not responded well to such criticisms; rather, he has found reasons to quarrel with 
their details and to continue publishing reviews, based on methods that others find questionable, which 
claim that level of school funding and the things those funds can buy--such as smaller classes--have but 
few discernible effects.⁹ These efforts have endeared Hanushek to political conservatives who have 
extolled his conclusions, complimented his efforts, and asked him to testify in various forums where 
class-size issues are debated. And in return, Hanushek has embedded his conclusion about the 
supposed lack of class-size effects in a broader endorsement of a conservative educational agenda.¹⁰ 
Given these activities and allegiances, it is no longer possible to give credence to Hanushek’s judgments 
about the impact of class size. 

But does this mean that one should now conclude that small, econometric surveys do confirm a 



⁵ See Hanushek (1986; 1996; 1997; 1999). 

⁶ See Hedges, Laine, & Greenwald (1994); Greenwald, Hedges, & Laine (1996); and Hedges & Greenwald (1996). 

⁷ See Krueger (2000). 

⁸ See Finn & Achilles (1999), for example. 

⁹ Worse, although Hanushek is clearly aware that student-teacher ratio is not the same thing as class size (see Hanushek, 1999, 
p. 145), he has continued to argue that his reviews of literature based on the former imply findings about the latter. 

¹⁰ See Hanushek (1995). 
class-size effect? Actually, this is also unwise. Many of these small surveys have used inappropriate 
samples, most have not employed controls for other classroom or school characteristics whose effects 
might be confused with those of class size, and nearly all have used measures of student-teacher ratio 
rather than class size. Thus, the bulk of this literature has provided very little information about the effects 
of class size in the real world. 

Fortunately, a few well-designed, large-scale surveys have appeared on the subject, and we may 
gain ideas about class-size impact by looking at their findings.¹¹ To illustrate, in 1996 Ronald Ferguson 
and Helen Ladd reported a survey in which they examined average gains in achievement scores for 
fourth-grade students from all schools in the state of Alabama. After controlling for various measures of 
home advantage and teacher qualification, they found sizable effects for class size. In addition, results 
from the Ferguson and Ladd study suggest that small-class advantages for fourth-grade students are 
likely to appear for more than one type of subject matter. 

Or, to take another example, Marta Elliott recently reported a large survey of mathematics and 
science achievements for eighth-grade students, based on data from across the country obtained in the 
National Education Longitudinal Study of 1988. She found that higher student achievement was 
associated with higher levels of qualifications possessed by their teachers and the use of more effective 
pedagogic techniques, but it was not significantly associated with small class size. 

These results suggest two modifications for findings we expressed earlier: 

- Long-term exposure to small classes in the early grades has also been found to 
increase measured student achievements, and the extra gains it generates 
may be substantial; and 

- Extra gains associated with small classes may not appear at all at the upper-grade, 
middle-school, and secondary levels. 

Two additional problems should also be noted about survey efforts to date. For one, authors and 
reviewers of these studies have often seemed to be unaware of experimental research on the effects of 
small classes. This is too bad. Experiments and surveys generate differing but complementary types of 
evidence, but theories and policy recommendations concerned with small classes and their effects must 
surely accommodate all types of evidence on the subject. 

For another, surveys can make a particularly strong contribution when they explore how events 
vary among different sectors of the population. When applied to the study of small classes, for example, 
this means that survey evidence should eventually be able to tell us whether small-class effects differ 
among students depending on their gender, race, poverty status, or home condition; among various types 
of classrooms and schools; among differing educational topics; and among city-center, suburban, and 
rural communities, various states or regions in the nation, and differing ethnic and national contexts. 
Unfortunately, broad survey evidence concerning these issues has so far been hard to find. 

Trial Programs and Large Field Experiments 

Fortunately, some of the shortcomings of survey studies have been partly dealt with by other types of 
small-class research. In the 1980s political debates about the effects of small classes began to appear in 
America's state legislatures, and some of these have generated trial programs or large-scale field 
experiments. We turn now to some of these latter efforts. 

Indiana's Project Prime Time. We begin with a trial program in Indiana that is known today as 
"Project Prime Time."¹² This effort began in 1981 when the Indiana legislature allocated $300,000 for a 
two-year study of the effects of reducing class size in the early grades within a sample of 24 public 
schools. But after two semesters the results of this initial study were so impressive that additional funds 
were allocated to reduce class sizes in all state schools beginning with first-grade classes in the 1984-85 
school year, and the program was gradually extended so as to involve grades K-3 by 1987-88. 

In its latter form, Project Prime Time reduced class sizes to an average of 18 students per teacher 
(compared with more than 25 students per class before the project began), but since this treatment was 
applied to all K-3 classrooms in the state, it was not possible to compare results for small classes with a 
comparable group of larger classes. However, some schools in the state had experienced small classes 
before Project Prime Time began, so it was possible to compare achievement records for the latter with 
those from schools which had reduced class sizes. This comparison was made for second-grade 
achievement records (sampled from six school districts that had, compared with three that had not, 
reduced class sizes), and the analysts found substantially larger gains for reading and mathematics 
achievement for students where class size had been reduced.¹³ 

This sounded promising, but critics soon pounced on the design of Project Prime Time, decrying 
the fact that within it students had not been assigned to experimental and control groups on a random 
basis, pointing out that other changes in state school policy had also been adopted during the project, and 



¹¹ See, for example, Ferguson (1991); Ferguson & Ladd (1996); Wenglinsky (1997a, b); or Elliott (1998). 

¹² See Indiana Department of Public Instruction (1983); Sava (1984); and McGivern, Gilman & Tillitski (1989). 
¹³ See McGivern et al. (1989). 
suggesting that teachers in the state knew how results from the trial program were supposed to come out, 
so they were motivated to make certain that small classes did, indeed, achieve better results. Indiana 
students probably did benefit from the project, but a persuasive case for small classes had not yet been 
made. Clearly, a better experiment was needed. 

The Tennessee STAR Project. Such an experiment would shortly appear in a study known today 
as the Tennessee STAR (Student/Teacher Achievement Ratio) Project. This study was arguably the 
largest, best-designed field experiment that has ever appeared for education and has provoked a great 
deal of interest, so we shall describe it carefully. (Major persons involved in organizing and promoting the 
STAR project have included Charles Achilles, Jeremy Finn, Helen Pate-Bain, Tennessee State 
Representative Steve Cobb, Frederick Mosteller, and Alan Krueger.)¹⁴ 

The STAR Project was begun in the mid-1980s when the Tennessee legislature funded an initial 
four-year study seeking to compare achievements for early-grade students who would be assigned 
randomly to one of three treatment conditions: standard classes (with one certificated teacher and more 
than 20 students); supplemented classes (with one teacher and a full-time, non-certificated teacher's aide); 
and small classes (with one teacher and about 15 students). It began with a cohort of students who 
entered kindergarten in the autumn of 1985, and the study design called for each of those students to 
attend the same type of class for four years. To control for unwanted effects associated with schools and 
communities, each school participating in the study was to sponsor all three types of classes, and 
students and teachers within those schools were to be assigned to treatment conditions randomly. 
Participating teachers were given no prior training for the type of class they were to teach. 
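
Random assignment within a single school can be sketched in a few lines. This is an illustration only, not the procedure the STAR investigators used: the function name, the student identifiers, the class sizes, and the seed are all assumptions made for the example.

```python
import random

# Hypothetical class sizes for one school. The text gives only "about 15"
# for small classes and "more than 20" for standard classes.
CLASS_SIZES = {"small": 15, "standard": 21, "supplemented": 21}

def assign_randomly(student_ids, class_sizes, seed):
    """Shuffle one school's students, then fill each class type in turn."""
    students = list(student_ids)
    if len(students) != sum(class_sizes.values()):
        raise ValueError("class sizes must add up to the number of students")
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    rng.shuffle(students)
    assignment, start = {}, 0
    for class_type, size in class_sizes.items():
        assignment[class_type] = students[start:start + size]
        start += size
    return assignment

# One school with 57 entering kindergartners, numbered 1 to 57.
classes = assign_randomly(range(1, 58), CLASS_SIZES, seed=1985)
for class_type, members in classes.items():
    print(class_type, len(members))
```

Because every student in the school has the same chance of landing in each type of class, differences in family background or prior ability should balance out across the three conditions on average, and because all three types sit in the same building, school and community effects are held constant.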

Primary schools from throughout the state were invited to be in the study, but each school had to 
agree to remain in it for four years and to have at least 57 kindergarten-age children available to 
participate (so that at least one of each type of class could be set up within the school). Participating 
schools were also to receive no additional support other than funds to hire additional teachers and aides 
(both available within the state at that time), and each school had to supply the classrooms needed for the 
project. These constraints meant that troubled schools and those which disapproved of the study, as well 
as schools that were too small, too crowded, or too underfunded, would not participate in it, and in fact the 
sample for the first year of the project involved "only" 79 participating schools, 328 classrooms, and about 
6,300 students. Those schools came from all corners of the state, however, and represented urban, 



¹⁴ Readers interested in further details about STAR may want to consult Folger et al. (1989); Finn & Achilles (1990); Word et al. 
(1990); Mosteller (1995); Grissmer et al. (1999); Krueger (1999); Nye, Hedges, & Konstantopoulos (1999, 2000); Boyd-Zaharias & 
inner-city, suburban, and rural school districts. As well, the student sample contained both majority 
students and a sizable number of African-Americans as well as students from impoverished homes who 
were then receiving free lunches at their schools under federal support programs. 
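
The minimum of 57 kindergarten-age children can be reproduced from the class descriptions given above. The arithmetic below is one decomposition consistent with those descriptions; the specific sizes are assumptions, and the project's own sizing rules are not stated in this text and may have differed.

```python
# Assumed minimum sizes, read off the class descriptions in the text.
small_class = 15          # "about 15 students"
standard_class = 21       # smallest whole number that is "more than 20"
supplemented_class = 21   # assumed to be a standard-size class plus an aide

# One class of each type must fit within a single school's cohort.
minimum_cohort = small_class + standard_class + supplemented_class
print(minimum_cohort)
```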

By the beginning of the 1986-87 school year, the second year of the study, several events had 
cropped up which meant that the sample for the project had to be revised. For one thing, American 
families move around a lot, and this meant that some families whose children had participated in STAR 
classes the previous year were by then living elsewhere. For another, some students had been forced to 
drop out of the study for reasons of poor health or because they had been held back for a second year of 
kindergarten. These factors meant that there were vacant seats in all three types of STAR classes at the 
beginning of year two, but other families had also by then moved into districts served by STAR schools, 
and their children were available to fill those vacant seats. As well, attending kindergarten was not then 
mandatory in Tennessee, and this meant that some new students in STAR districts were actually entering 
school for the first time that year. 

These factors meant that new students were placed in all three types of STAR classes at the 
beginning of the second year of the study. In addition, some parents sought to move their children from 
one type of STAR class to another, but these requests were resisted by school authorities and those 
conducting the study (although in a few cases students were allowed to move from a standard class to a 
supplemented class or vice versa). Similar, although less dramatic, shifts in the sample were also to take 
place at the beginning of the 1987-88 and 1988-89 school years. By the end of the initial, four-year study, 
then, some students had been exposed to a given type of STAR class (small classes, for example) for 
four years, but others had spent only three, two, or one year in such classes. These shifts in the student 
sample might possibly have biased STAR results, but Alan Krueger performed a careful analysis of 
student migration during the four-year experiment and concluded that such bias was minimal.¹⁵ 

To assess how well students were doing in the STAR study, towards the end of each year they 
were given the Stanford Achievement Test battery which generated separate achievement scores for 
reading, word-study skills, and mathematics. When results from these tests were examined, a number of 
findings appeared. First, it quickly became clear that results from standard classes and supplemented 
classes were quite similar. (Thus, few advantages appeared merely because untrained aides were added 
to classes of standard size.) However, results for small classes were far more dramatic, suggesting that: 



Pate-Bain (2000); Finn, Gerber, Achilles, & Boyd-Zaharias (2001); or Krueger & Whitmore (2001). 
¹⁵ See Krueger (1999). 
- Long-term exposure to small classes (in the early grades) had generated substantially 
higher levels of achievement; and 

- The extra gains associated with long-term exposure to small classes (in the early 
grades) were greater the longer students were exposed to those classes. 

These two effects are displayed in Figure 1, which expresses the advantages found in STAR for 
small classes, when compared with standard classes, as months of greater reading achievement for 
average students.¹⁶ To illustrate, when comparing reading achievement scores for students who were 
exposed to small versus standard classes over the four years of the study, STAR investigators found that 
the former were 0.5 months ahead by the end of the kindergarten year, 1.9 months ahead at the end of 
first grade, 5.6 months ahead in second grade, and 7.1 months ahead by the end of grade three. Note 
also that achievement advantages were smaller, although still impressive, for students who were only 
exposed to three, two, or one year of small classes. (Similar results indicating small-class advantages 
were also obtained for word-study skills and mathematics, although details for the three topics differed 
slightly.) 
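
The four reading figures quoted above can be turned into year-by-year increments to show how the advantage accumulated. The snippet uses only the numbers given in the text for students who spent all four years in small classes; the variable names are illustrative.

```python
# Reading advantage (in months) at the end of each grade for students who
# attended small classes throughout, as quoted in the text.
advantage_months = {"K": 0.5, "1": 1.9, "2": 5.6, "3": 7.1}

grades = list(advantage_months)

# Additional advantage gained during each year after kindergarten.
yearly_gain = {
    later: round(advantage_months[later] - advantage_months[earlier], 1)
    for earlier, later in zip(grades, grades[1:])
}

for grade, gain in yearly_gain.items():
    print(f"grade {grade}: +{gain} months")
```

The advantage grew in every year, by 1.4 months in first grade, 3.7 in second, and 1.5 in third, which is what one would expect if each further year in a small class added to the gains already made.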




Figure 1. Average Months of Grade-Equivalent Advantage in 
Reading Achievement Scores for Students in Small Classes 
(horizontal axis: grade when test administered) 



¹⁶ Figures 1 and 2 report data that originally appeared in Finn et al. (2001) and were prepared with kind help from Jeremy Finn. 
In addition, STAR investigators found that small-class advantages appeared for all types of 
students participating in the study and were quite similar for boys and girls. However, those advantages 
were greater for impoverished students, African-American students, and students from inner-city schools. 
Thus: 

- Although all types of students experienced extra gains from long-term exposure to small 
classes (in the early grades), those gains were greater for students who are 
traditionally disadvantaged in education. 

These initial STAR findings were certainly impressive, but would they “last”? Would students who 
had been exposed to small classes early on retain their extra gains when returned to standard classes in 
grade four? To answer this question, the Tennessee legislature authorized a second study to examine 
outcomes during subsequent years for students who had originally attended STAR classes. 

It is useful to provide a time perspective for this second study. If they were not "held back" for any 
reason, STAR students would have been in fourth grade during the 1989/90 school year, grade six in 
1991/92, grade eight in 1993/94, and twelfth grade in 1997/98. During most of these years their 
end-of-the-year achievements were assessed by means of another test battery, the Comprehensive Tests of 
Basic Skills, which provided scores for four topics: reading, mathematics, science, and social science. 
Once again, it was possible to express these scores as months of average achievement for students from 
the different types of STAR classes, and when this was done, it was found that average students who had 
attended small classes were months ahead of those from standard classes for each topic assessed at 
each grade level. Results for some of these years are displayed in Figure 2, which shows, for example, 
that when typical students who had experienced one or more years of small classes in the early grades 
reached grade eight, they were 4.1 months ahead in reading, 3.4 months ahead in mathematics, 4.3 
months ahead in science, and 4.8 months ahead in social science. 

Figure 2. Average Months of Grade-Equivalent Advantage in 
Achievement Scores for Students Who Experienced One or More 
Years of Small Classes 

Students who had attended small classes also enjoyed other advantages in the upper grades. 
They earned better grades on average, fewer of them had dropped out of the schools they were attending, 
and over the years fewer of them had been retained in grade. And once they entered high school, more 
small-class students opted to learn foreign languages, more took advanced-level courses, more were to 
be found in the top 25% of their classes, more graduated from high school, and more volunteered to take 
the ACT and SAT exams (the major tests now taken by high school seniors who aspire to enter colleges 
and universities). Moreover, initial published results have suggested that these upper-grade effects were
also larger for students who are traditionally disadvantaged in education.¹⁷

To examine merely two of these effects, look at Figure 3, which displays the percentages of
students who, having experienced small classes or standard classes in the early grades, opted to take the
ACT or SAT as high-school seniors.¹⁸ As can be seen, among all students, roughly 44% of those from
small classes took one or both of these tests whereas only 40% of students from standard classes did so.
However, the difference was far greater for African-American students. In the latter case, roughly 40% of
small-class students took the ACT or SAT whereas for students from standard classes the figure was only
32%. (Or to put this latter finding differently, early attendance in small classes allowed black students to
overcome more than half of the traditional disadvantages they have displayed in rates for participation in
the ACT and SAT testing programs.)

¹⁷ See Krueger & Whitmore (2001).

¹⁸ Data for Figure 3 came from Krueger & Whitmore (2001), and the figure was prepared with kind help from Alan Krueger.

These results indicate additional STAR findings: 

-- The extra gains found for long-term attendance in small classes (in the early grades)
continued to appear when students were returned to standard classes in the
upper grades;

-- Extra gains associated with long-term attendance in small classes (in the early
grades) appeared not only for tests of measured achievement but also for other
measures of success in education; and

-- (Initial results indicate that) the greater gains experienced by students from groups
that are traditionally disadvantaged for education were retained when those
students were returned to standard classes.




Figure 3. Percent of Students Who Took the ACT or SAT College Entrance Exam by Early-Grade
Class Type (groups shown: all students, white students, black students)



Taken together, findings from the STAR project have certainly been impressive, but lest we be 
tempted to conclude they are "definitive," we should also think about questions that have been raised 
about STAR. For one thing, the student sample involved in the STAR project did not quite match the 
American population; very few Hispanic, Native American, and immigrant (non-English-speaking) families
were living in Tennessee in the mid-1980s, and thus few students from such groups participated in STAR.
For another, news about the greater achievement gains of small classes leaked out early during the STAR 
project, and one wonders how this affected participating teachers and why parents whose children had 
been assigned to standard and supplemented classes did not then demand that their children be 
reassigned to small classes. And for a third, schools participating in STAR had volunteered to do so, and 
it is possible that the teachers and principals in those schools had particularly strong interests in new 
ideas and innovation. Questions such as these do not imply that we should reject findings from STAR, but 
they serve to remind us that the STAR project was but a single study and that other evidence would also be
needed to nail down class-size effects.

Wisconsin's SAGE Program. As findings from STAR have gradually become known, they have
prompted class-reduction efforts in various venues around the nation. One type of effort has focused on 
the idea that Americans can provide targeted help for disadvantaged students by increasing the number of 
small, early-grade classes in neighborhoods where those students are clustered. 

An early example of such a program began in Tennessee in 1989 and was conducted under the 
supervision of STAR investigators. Within this program, class sizes were reduced for grades K-3 in 17 
school districts where average family income was low and the number of students receiving free lunches
in schools was high. Results indicated that students from small classes in these districts improved their
achievement scores for both reading and mathematics (when compared both with previous performances
by students in those districts and with other schools in the state). However, this program did not involve
control groups of classrooms, so it was more a demonstration program than an experiment.

Other projects, focused on small classes in the early grades and influenced by STAR results, 
were begun in North Carolina, in 1991, within Burke and Guilford Counties where many students were 
then receiving subsidized lunches. These projects compared results for small and standard classes and 
found small classes to be superior for various measures of academic achievement.¹⁹ However, the
projects were quite small in scope. 

Still other small-class initiatives have appeared in other corners of the nation, such as Michigan, 
Tennessee, Nevada, and Buffalo, New York. However, a much larger trial program, focused on the needs 
of disadvantaged students and reflecting leadership by Alex Molnar, began during the 1996/97 school 
year in Wisconsin.²⁰ This effort, termed the Student Achievement Guarantee in Education (SAGE)
Program, was designed as a five-year pilot project for K-3 classes in school districts where at least 50% of 
children were living below the poverty level. Although all schools in these districts were invited to apply for 
the program, only one school in each such district was allowed to participate at the beginning (except in 
Milwaukee County which was allowed up to ten SAGE schools), and no additional schools were to be 
added after the program had begun. Funding was set at $2,000 per low-income student enrolled in SAGE 
classrooms. No school district applying to participate was turned down, and 30 schools (in 21 districts) 
began the program at the kindergarten and first-grade levels in 1996. Second grade was added for these
schools in 1997/98 and third grade in 1998/99.²¹

¹⁹ See Achilles, Harman, & Egelson (1995) and Achilles (1999) for descriptions of these projects.

²⁰ See Molnar et al. (1999); Zahorik (1999); Molnar et al. (2000).

In theory, the initial SAGE program involved four interventions: (a) reducing average class size to 
15 students per teacher for grades K-3, (b) establishing "lighted school-house" procedures in participating 
schools from early morning through late evening, (c) developing "rigorous" curricula, and (d) creating a 
system of staff development and professional accountability. However, and for various reasons, only the 
class-size-reduction intervention was uniformly implemented among SAGE schools. This was 
accomplished mainly by assigning 15 or fewer students to teachers within standard classrooms, but 
(because trial programs and field experiments are done in real-world settings) in a few cases other 
strategies were also employed for reducing student-teacher ratios. The latter included assigning two 
teachers to larger classrooms, fitting temporary walls within large classrooms so as to create space for 
"two small classrooms," and employing "floating teachers" who provided supplementary instructional help 
for reading, language arts, and mathematics instruction. 

Outcomes of the program have been assessed by comparing results for SAGE schools that 
adopted small classes with results for other schools from the same districts, having normal class sizes, 
that resemble SAGE schools in average family income, prior records of achievement in reading, K-3 
enrollment, and racial composition. Findings so far available have indicated larger gains for students from
small classes--in achievement scores for language arts, reading, and mathematics--that are roughly
comparable to those from the STAR Project. In addition, as in STAR results, relatively larger gains have
been found for African-American students. (In contrast, preliminary analyses suggest that assigning two 
teachers to larger classrooms and employing "floating teachers" did not create larger gains for students.) 

Since findings for the initial SAGE effort were announced, the Wisconsin legislature has come
under pressure to expand the scope of its small-class initiative, and it has now extended the SAGE
program to other primary schools in the state. Thus, what began initially as a small trial project has now
blossomed into a statewide program that makes small classes in the early grades available for schools 
serving needy students. 

The California Class Size Reduction Program. The SAGE program began in 1996/97, and the
same year saw the beginning of a far more controversial class-size-reduction program in California.²²
Numbers of immigrant, non-English-speaking families have soared within "The Golden State" in recent
years while per-capita fiscal support for public education has been declining, and by 1996 California
schools were suffering many problems and were ranked last in the nation by major measures of
achievement. However, a fiscal windfall became available that year, so in May of 1996 California's then
Governor, Pete Wilson, announced a new policy that provided $650 per student (later increased to
$800) for all primary schools that would agree to reduce class size in the early grades from the state-wide
average of more than 28 students per teacher to not more than 20 students in each class.

²¹ Note that several conditions within the SAGE program were similar to those of STAR. SAGE also involved schools that had
volunteered to participate in the program. Those schools were also provided sufficient funds to hire additional teachers, and an
adequate supply of credentialed teachers was again available within the state. However, SAGE involved somewhat more Hispanic,
Asian, and Native American students than had STAR.

²² See Hyman (1997); Illig (1997); Schwartz & Warren (1997); Korostoff (1998); Kuo (1999); Bohrnstedt, Stecher, & Wiley (2000);
Stasz & Stecher (2000); Stecher, Bohrnstedt, Kirst, McRobbie, & Williams (2001).

Several problems with this program quickly surfaced. For one, the definition it mandated for
"small classes" differed from that recommended elsewhere and investigated in the studies we have
reviewed above. Under this definition, in fact, California primary schools were being asked to set up
"small classes" which matched the sizes of "standard classes" in some other states! On the other hand,
some schools in California had previously been trying to cope with 30 or more students per classroom in
the early grades, so for them a reduction to 20 students was actually an improvement.

For a second, per-student funding for the program was clearly inadequate. (Contrast the $2,000
per student provided under SAGE with the $650 or $800 per student being offered under the California
initiative.) Nevertheless, the lure of additional funding has proven seductive, and most California school
districts have now applied to participate in the program. This has imposed serious consequences on
poorer school districts, which have had to abolish other needed activities to find the extra funds required to
pay additional teachers to staff "small" classes. In effect, then, the program has created (rather than
solved) problems for underfunded school districts.

In addition, in the mid-1990s California's education system was facing several problems that
threatened the class-size-reduction initiative--among them serious overcrowding in many of its primary
schools and a huge shortage of well-trained, certificated teachers. To cope with the first of these
problems some schools have created spaces for "small classes" by cannibalizing other needed facilities--
special education quarters, child care centers, music and art rooms, computer laboratories, libraries,
gymnasia, and teachers' lounges, for example--whereas others have had to tap into their operating
budgets to buy portable classrooms, which has meant delays in paying for badly needed curricular
materials or repairs for deteriorating school buildings. To cope with the second, many school districts
have had to hire new "teachers" for their "small classes" who were not certificated and had no prior
training for their jobs.

So far, results from the California program have been only modest. Informal evidence suggests
that most students, parents, and teachers are pleased with the smaller classes that have appeared in their
schools. And comparisons between the measured achievements of third-grade students from districts that 
did and did not participate in early phases of the program have indicated minor advantages for "small" 
classes. However, these latter effects have been smaller than those reported for the STAR and SAGE 
programs. 

In many ways, the California initiative has provided a near-textbook case of how not to reduce 
class size within a specific state. Within California, no trial program was conducted to explore
class-size-reduction options; a definition of "small classes" was adopted that contradicted prior evidence and the
experiences of other states; inadequate funds were provided to pay for the initiative; and serious
problems associated with overcrowded schools and a shortage of qualified teachers in the
state were ignored. Given such an event history, it is small wonder that outcomes of the California initiative have been
weak. Indeed, this example should serve to remind us that smaller classes are not an educational
panacea--that in order to be effective, programs for reducing class size should be planned with care and
with thought given to the other needs and strengths of existing school systems.



What Do We Know About Small Classes Today? 



Major Conclusions 

Given findings from these different types of research, what should we conclude today about the effects of 
small classes? Although the results of individual studies are always questionable, a host of different 
studies have now appeared on the effects of small classes, and those studies suggest a number of 
general conclusions: 



-- When it is planned thoughtfully and funded adequately, long-term exposure to
small classes in the early grades generates substantial advantages for
students in American schools, and those extra gains are greater the longer
students are exposed to those classes;

-- Extra gains from small classes in the early grades are larger when class size is
reduced to fewer than 20 students;

-- Extra gains from small classes in the early grades are found for various academic
topics and for both traditional measures of student achievement and other
indicators of student success;

-- Extra gains from small classes in the early grades are retained when students are
returned to standard-size classrooms, and these gains are still present in the
upper grades, the middle school, and the high school years;

-- Although extra gains from small classes in the early grades appear for all types of
students (and seem to apply equally to boys and girls), they are greater for
students who have traditionally been disadvantaged for education;

-- (Initial results indicate that) the greater gains associated with small classes in the
early grades for students who have traditionally been educationally
disadvantaged are also carried forward into the upper grades and beyond;
and

-- Evidence for the possible advantages of small classes in the upper grades and
high school is so far inconclusive.

Tentative Theories 

Why should small classes have such impressive effects when employed in the early grades? On the face 
of it, to reduce the number of students in classes during the first four years of school would seem to be a 
mechanical step. Why should such an action generate extra gains for students, why should it provoke 
such a wide range of gains, why should those gains persist when students are older, and why should they 
be greater for students who have come from educationally disadvantaged groups? 

Theories concerning these issues have fallen largely into two camps. Most theorists have 
focused on the teacher and have reasoned that small classes work their magic because interactions 
between the teacher and individual students are improved in the small-class context. To exemplify such 
theories, we turn first to Frederick Mosteller, who argued that:



Reducing [the size of classes in the early grades] reduces the distractions in the room and gives the teacher
more time to devote to each child.... When children first come to school, they are confronted with many
changes and much confusion. They come into this new setting from a variety of homes and circumstances.
Many need training in paying attention, carrying out tasks, and interacting with others in a working situation.
In other words, when children start school, they need to learn to cooperate with others, to learn to learn, and
generally to get oriented to being students. (1995, p. 125)

Thus, reducing class size in the early grades "works," at least in part, because it is in these grades 
that children are first learning about the rules of standard classroom culture and forming ideas about 
whether they can cope with education. Many children have difficulty with these tasks, and their efforts are 
greatly aided when they can interact with teachers on a one-to-one basis--a process more likely to take
place when the class is small. (One-to-one interaction allows teachers to learn more about individual 
students and their needs, thus to help students to develop more useful habits and ideas about themselves 
and their abilities.) In addition, teachers in small classes have higher morale, and this enables them to 
provide a more supportive environment for initial student learning. But learning how to cope well with 
school is basic to educational success, and those students who solve this task when young will thereafter 
carry broad advantages, in the form of more effective habits and more positive self-concepts, that will 
serve them in later years of education (and presumably the wider world beyond). 

The need to master this task confronts children from all walks of life, but it is often a more 
daunting challenge for children who come from impoverished homes, ethnic groups that have suffered 
from discrimination or are unfamiliar with American classroom culture, or urban communities where home 
and community problems interfere with education. Thus, children from such backgrounds have 
traditionally had more difficulty coping with classroom education, and they are more likely to be helped 
when class size is reduced. 

This theory also helps to explain why reducing class size may not generate significant advantages 
if introduced in the upper grades. Older students have long since developed both good and bad habits for 
coping with standard classrooms and evolved both effective and ineffective self-concepts relevant to 
academic subjects, and these are not likely to change just because class size is reduced. Thus, if 
reducing class size has effects at all in the upper grades, those effects would presumably reflect factors 
other than the ones suggested in this first theory. 

The theory also suggests limits for the extra gains one should expect from small classes in the 
early grades. Clearly, students are likely to learn more and develop better attitudes towards education if 
they are exposed to well-trained and enthusiastic teachers, appropriate and challenging curricula, and 
physical environments in their classrooms and schools that support learning. If conditions such as these 
are not also present, then to reduce class size in the early grades will presumably have but little impact. 
Thus, when planning programs for reducing class size, we should also think about the professional 
development of teachers who will participate in them and the educational and physical contexts in which
those programs will be placed. 

A second group of theories designed to account for class-size effects focuses, not on the teacher, 
but rather on the classroom environment and student conduct. It has been known for years that discipline 
and classroom management problems interfere with subject matter instruction. It is argued that such 
problems are less prominent in small classes, and this means that in them students are less often 
withdrawn or obstreperous and are more likely to be engaged in learning. Moreover, teacher stress 
should be less likely in small classes, so in the small-class context teachers can provide more support for 
student learning. In addition, studies of instructional groups within classrooms have found that small
groups can provide an environment for learning which is quite different from that of the large classroom.
(In brief, small groups can create supportive contexts in which learning is less competitive and students 
are encouraged to form supportive relationships with one another.) 

Theories such as these suggest that the small-class environment is structurally different from that 
of the large class and that this structural difference generates conditions favoring education. Among
other things, within small classes we should expect to find less time spent on management, higher levels of
student participation, more time spent on instruction, more teacher support for learning, and more positive 
relations among students. And these processes should lead both to greater subject-matter learning and to 
more positive attitudes about education among students. And again, these effects should be greater for 
students from groups that are traditionally disadvantaged for education and more substantial in the early 
grades (when students are just learning to cope with classrooms). 

The fact that two types of theories have been stressed here does not mean that these theories are 
mutually exclusive. On the contrary, both--as well as related theories--may provide partial insights about
what typically happens in small classes and why those small-class environments help so many students.²³
It is also useful to note that such theories could be assessed directly by collecting other types of evidence,
particularly from observational studies that compare the details of interaction in early-grade classes of
various sizes and surveys of the attitudes and self-concepts of students who have been exposed to those
classes. Unfortunately, good studies of these latter types have been difficult to find.²⁴
In addition, other research is needed to explore teaching strategies that are most effective in small
classes and to study small-class effects in social settings and among ethnic groups for which evidence is
so far skimpy.

²³ Indeed, our theoretical understanding of processes occurring in small classes is still evolving, but a good introduction to the topic may be found in
a recent paper by Lorin Anderson (2000).

²⁴ Most observational studies of small classes to date have focused on the upper grades, have been conducted in other countries, or have not
contrasted events found in small classes with those found in larger classes. However, suggestive evidence concerning classroom processes may be
found in Evertson & Randolph (1989); Achilles (1999); Molnar et al. (2000); Stasz & Stecher (2000); and Achilles, Proul Finn, & Bobbett (2001).
Studies of the attitudes and self-concepts of students exposed to small classes in the early grades seem not to have appeared as yet.

Policy Implications and Actions to Date 

Given the strength of findings from research on small classes, why haven't those findings provoked more 
reform efforts? Although many state legislatures have debated or begun reform initiatives related to class 
size, most primary schools in America today do not operate under policies that mandate small classes for
early grades. Why not? 

Several reasons may be suggested for this lack of impact, among them ignorance about the 
issue, confusion about the results of class-size research, prejudices against poor and minority children, 
ineffective dissemination of results from research, and the politicizing of debates about class-size effects 
and their implications.²⁵

Regarding the latter, it is easy to detect political agendas in recent national debates about class
size, with Democrats generally favoring class-size reduction and Republicans generally hostile to it. In
his 1998 State of the Union Address, President Bill Clinton declaimed: 



Now we must make our public elementary and secondary schools the best in the world.... And every parent 
already knows the key--good teachers and small-class size in the early grades.... We will reduce class size
in the first, second, and third grades to an average of 18 students in a class.²⁶

Responding to this call, the federal Congress set up a modest program, aimed at certain urban school
districts with high concentrations of poverty, which provided funds for hiring additional teachers during the
1999 and 2000 fiscal years. This program enabled some of those districts to cut class sizes in the early
grades, and informal results from those districts indicated gains in student achievement.²⁷

In contrast, Republicans have been lukewarm to extending this program--some apparently
believing that it is ineffective or is merely a scheme for enhancing the coffers of teachers' unions. As a
result, Republicans have generally welcomed President George W. Bush's call for an alternative federal
program focused on high-stakes achievement tests and using results from those tests to sanction schools
if they do not perform "adequately," and the education reform bill passed by the Congress in 2001 was
largely concerned with the latter.

²⁵ See Bracey (1995).

²⁶ New York Times (1998).

²⁷ See Cohen, Miller, Stonehill, & Geddes (2000); Naik, Casserly, & Uro (2000).

However, the major problems standing in the way of reducing class sizes would seem to be 
practical ones. In many cases, extra teachers would have to be hired if class sizes were cut, and--given
the looming shortage of qualified teachers to serve our growing public school populations--it may be
difficult to find those extra teachers let alone the funds to pay their salaries. Furthermore, many schools 
would also have to find or create extra rooms to house the additional classes created by small-class 
programs, and this would require either modifying school buildings or acquiring temporary classroom 
structures. 

In many cases, meeting needs such as these would mean increasing the size of public school 
budgets, a step abhorred by fiscal conservatives and those who are critical of public education, so the 
latter have been tempted to argue that other reforms would be more "effective" and would cost less than 
reducing class sizes. In response to such claims, various studies have been published trying to estimate 
the costs of class-size-reduction programs or comparing their estimated costs with those of other 
proposed reforms. Unfortunately, studies of these types must make questionable assumptions,²⁸ so the
results of their efforts have not been persuasive, and as Charles Achilles points out, some schools can cut
class sizes in the early grades by merely reallocating resources.²⁹

Nevertheless, reducing the size of classes for students in the early grades often requires 
additional funds, although sizable educational benefits result when this step is taken. Students from all 
walks of life reap long-lasting advantages, but students from educationally disadvantaged groups benefit 
particularly. Indeed, if we are to judge by available evidence, no other educational reform has yet been 
studied that would provide such striking benefits, so debates about reducing class sizes are basically 
disputes about values. If Americans are truly committed to providing quality public education and a level 
playing field for children regardless of background, once they learn about the advantages of small classes 
in the early grades, they will presumably find the funds needed to reduce class size. 



²⁸ To illustrate, teachers' organizations have long argued for smaller classes, and evidence has appeared showing that teacher
morale is higher in the small-class context (see, among other sources, Glass & Smith, 1979; or Molnar et al., 1999). This suggests
that teachers who are assigned to smaller classes may experience more satisfaction, suffer less burnout, and be less likely to resign
from the field. In a decade when turnover in the teaching profession is high and a shortage of qualified teachers looms, reducing
class sizes may actually be more cost effective than trying to train and hire ever-increasing numbers of new teachers, but this
possibility seems not to have been explored yet by those trying to estimate costs for small-class programs.

²⁹ Achilles (1999), pp. 141-161.